MobiFace: A Novel Dataset for Mobile Face Tracking in the Wild
Face tracking is the crucial first step for mobile applications that aim to
analyse target faces over time. However, this
problem has received little attention, mainly due to the scarcity of dedicated
face tracking benchmarks. In this work, we introduce MobiFace, the first
dataset for single face tracking in mobile situations. It consists of 80
unedited live-streaming mobile videos captured by 70 different smartphone users
in fully unconstrained environments. Bounding boxes are manually
labelled. The videos are carefully selected to cover typical smartphone usage
and are annotated with 14 attributes, including 6 newly proposed
attributes and 8 commonly seen in object tracking. 36 state-of-the-art
trackers, including facial landmark trackers, generic object trackers and
trackers that we have fine-tuned or improved, are evaluated. The results
suggest that mobile face tracking cannot be solved through existing approaches.
In addition, we show that fine-tuning on the MobiFace training data
significantly boosts the performance of deep learning-based trackers,
suggesting that MobiFace captures the unique characteristics of mobile face
tracking. Our goal is to offer the community a diverse dataset to enable the
design and evaluation of mobile face trackers. The dataset, annotations and the
evaluation server will be available at \url{https://mobiface.github.io/}.

Comment: To appear at the 14th IEEE International Conference on Automatic Face
and Gesture Recognition (FG 2019).
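Tracker benchmarks of this kind are typically scored by bounding-box overlap against the manual annotations; a minimal sketch of the standard IoU success-rate metric (illustrative, not the paper's exact evaluation protocol) in Python:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    # Width/height of the intersection rectangle (clamped at zero).
    ix = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def success_rate(pred_boxes, gt_boxes, threshold=0.5):
    """Fraction of frames where the predicted box overlaps
    the ground truth above the given IoU threshold."""
    overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return sum(o >= threshold for o in overlaps) / len(overlaps)
```

Sweeping the threshold over [0, 1] and averaging the resulting success rates gives the area-under-curve score commonly reported for single-object trackers.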
UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition
Recently proposed robust 3D face alignment methods establish either dense or
sparse correspondence between a 3D face model and a 2D facial image. The use of
these methods presents new challenges as well as opportunities for facial
texture analysis. In particular, by sampling the image using the fitted model,
a facial UV map can be created. Unfortunately, due to self-occlusion, such a map
is always incomplete. In this paper, we propose a framework for training a Deep
Convolutional Neural Network (DCNN) to complete the facial UV map extracted
from in-the-wild images. To this end, we first gather complete UV maps by
fitting a 3D Morphable Model (3DMM) to various multiview image and video
datasets, as well as leveraging a new 3D dataset with over 3,000 identities.
Second, we devise a meticulously designed architecture that combines local and
global adversarial DCNNs to learn an identity-preserving facial UV completion
model. We demonstrate that by attaching the completed UV to the fitted mesh and
generating instances of arbitrary poses, we can increase pose variations for
training deep face recognition/verification models, and minimise pose
discrepancy during testing, which leads to better performance. Experiments on
both controlled and in-the-wild UV datasets prove the effectiveness of our
adversarial UV completion model. We achieve state-of-the-art verification
accuracy under the CFP frontal-profile protocol solely by combining
pose augmentation during training with pose discrepancy reduction during
testing. We will release the first in-the-wild UV dataset (which we refer to
as WildUV), comprising complete facial UV maps from 1,892 identities, for
research purposes.
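One common way to assemble such a completed texture at inference time is to keep the texels that were visible when sampling the fitted model and take only the self-occluded texels from the generator output; a minimal NumPy sketch of this visibility-mask compositing (the function name and interface are illustrative assumptions, not the paper's API):

```python
import numpy as np

def composite_uv(observed_uv, generated_uv, visibility_mask):
    """Keep visible texels from the sampled UV map and fill
    self-occluded texels from the generator output.

    observed_uv, generated_uv: (H, W, 3) texture maps.
    visibility_mask: (H, W) array, 1 where the texel was visible.
    """
    mask = visibility_mask[..., None].astype(observed_uv.dtype)  # (H, W, 1)
    return mask * observed_uv + (1.0 - mask) * generated_uv
```

Attaching the composited texture to the fitted mesh then allows rendering the same identity under arbitrary poses, which is how the abstract's pose augmentation is obtained.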
Dynamic Face Video Segmentation via Reinforcement Learning
For real-time semantic video segmentation, most recent works utilised a
dynamic framework with a key scheduler to make online key/non-key decisions.
Some works used a fixed key scheduling policy, while others proposed adaptive
key scheduling methods based on heuristic strategies, both of which may lead to
suboptimal global performance. To overcome this limitation, we model the online
key decision process in dynamic video segmentation as a deep reinforcement
learning problem and learn an efficient and effective scheduling policy from
expert information about decision history and from the process of maximising
global return. Moreover, we study the application of dynamic video segmentation
on face videos, a field that has not been investigated before. By evaluating on
the 300VW dataset, we show that the performance of our reinforcement key
scheduler outperforms that of various baselines in terms of both effective key
selections and running speed. Further results on the Cityscapes dataset
demonstrate that our proposed method can also generalise to other scenarios. To
the best of our knowledge, this is the first work to use reinforcement learning
for online key-frame decision in dynamic video segmentation, and also the first
work on its application to face videos.

Comment: CVPR 2020. The 300VW dataset with segmentation labels is available at:
https://github.com/mapleandfire/300VW-Mas
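The key/non-key decision loop described above can be sketched as a skeleton in which the learned scheduler is a pluggable policy function; all names and signatures here are illustrative assumptions, not the paper's implementation:

```python
def run_dynamic_segmentation(frames, policy, segment, propagate):
    """Dynamic video segmentation loop with online key decisions.

    policy(frame, frames_since_key) -> True to treat the frame as a key
    frame (run the expensive full segmentation network); otherwise the
    previous result is cheaply propagated forward.
    """
    results, since_key, last_seg = [], 0, None
    for frame in frames:
        if last_seg is None or policy(frame, since_key):
            last_seg = segment(frame)              # full segmentation
            since_key = 0
        else:
            last_seg = propagate(last_seg, frame)  # cheap propagation
            since_key += 1
        results.append(last_seg)
    return results
```

In the paper's formulation the policy is learned with deep reinforcement learning, with a return that trades segmentation accuracy against the cost of running the full network; a fixed-interval policy is the baseline it improves on.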
3D facial geometric features for constrained local model
We propose a 3D Constrained Local Model (CLM) framework for deformable face alignment in depth images. Our framework exploits the intrinsic 3D geometric information in depth data through robust histogram-based 3D geometric features computed from surface normal vectors. In addition, we demonstrate that fusing intensity data with the 3D features further improves facial landmark localisation accuracy. Experiments are conducted on the publicly available FRGC database. The results show that our 3D feature-based CLM clearly outperforms the raw depth feature-based CLM in terms of fitting accuracy and robustness, and that fusing intensity and 3D depth features improves performance further. Another benefit is that the proposed 3D features do not require any pre-processing of the data.